ABSTRACT


The  rate  of  convergence  of  line  search  algorithms  based  on  general 
interpolating  functions  is  derived,  and  is  shown  to  be  independent  of  the 
particular  interpolating  function  used.  This  result  holds  for  the  root 
finding  problem  f(x)  =  0  as  well.  We  show  how  inverse  interpolation  can 
be  used  in  conjunction  with  the  line  search  problem,  and  derive  its  rate  of 
convergence.  Our  analysis  suggests  that  one-point  line  search  algorithms 
(in  particular  Newton's  method)  are  inefficient  in  a  sense.  Two-point 
algorithms using rational interpolating functions are recommended.
1.  INTRODUCTION


An  essential  part  of  multidimensional  minimization  algorithms  is 
a  line  search,  i.e.,  a  one-dimensional  scheme  for  the  solution  of  the 
equation 

(1)  f'(x)  =  0. 


Most of the line search algorithms in common use are based on
polynomial interpolation of f. At iteration i, a polynomial $P_{n,s}(x)$
(the so-called hyperosculatory interpolation polynomial), which coincides
with f and its derivatives up to order s-1 at each of the n+1 interpolation
points $x_i, x_{i-1}, \ldots, x_{i-n}$, is constructed. The new interpolation
point $x_{i+1}$ is the solution of

(2)  $P'_{n,s}(x_{i+1}) = 0$.

In fact, to facilitate the solution of (2), a low degree polynomial
is fitted, i.e., r = s(n+1) is small; quadratic and cubic fits are the most
commonly used.

In recent years, the possibility of using nonpolynomial interpolation
functions has received some attention. One important situation arises in line
searches associated with n-dimensional constrained problems solved by
barrier function methods. A fit by a polynomial cannot capture the
singular behavior of the barrier objective function at the boundary of the
feasible region. Wright [20] dealt with the case of the logarithmic barrier
function. She suggests using the interpolating functions

(3)  $ax + b + r \log(x - c)$

(4)  $ax^2 + bx + c + r \log(x - d)$.




Bjørstad and Nocedal [3] analyze the rate of convergence of an
algorithm based on the interpolating function

(5)  $\dfrac{ax^2 + bx + c}{(dx + 1)^2}$.

This function is the one-dimensional restriction of the "conic" model
function suggested by Davidon [5], who lists some important advantages of
the conic model over the quadratic one.

Independently, we suggested [1] another rational interpolating
function

(6)  $\dfrac{ax^2 + bx + c}{dx - 1}$,

which we analyze in section 3.

Nonpolynomial  interpolation  was  suggested  much  earlier  for  the  root 
finding  problem 

(7)  f(x)  =  0. 

Ostrowski [13, p. 82] used in this connection the rational function

which Jarratt and Nudds [8] and Jarratt [9] generalized to


where Q(x) is a polynomial. Ben-Tal and Ben-Israel [2] describe nonpolynomial
interpolations by certain types of generalized convex functions.

We formally define the $T_{n,s}$-interpolation algorithm as follows.

Let $n \ge 0$, $s \ge 1$ be fixed integers and let $\mathcal{T}$ be a family
of s-1 times differentiable functions $T : R \to R$, depending on r = s(n+1)
parameters. At iteration i, the points $x_i, x_{i-1}, \ldots, x_{i-n}$ are
given, and a function $T \in \mathcal{T}$ is chosen so as to satisfy the
interpolation equations

(9)  $T^{(k)}(x_{i-j}) = f^{(k)}(x_{i-j})$,  j = 0, ..., n;  k = 0, ..., s-1.

A new interpolation point is computed from

(10)  $T'(x_{i+1}) = 0$,

and the oldest point $x_{i-n}$ is deleted.
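As an illustration of the loop just defined, the following is a minimal sketch of the simplest polynomial instance (s = 1, n = 2, T a quadratic, i.e. the classical Quadratic Fit). The function names, the stopping tolerance and the test function exp(x) - 2x are assumptions made for this example only; no safeguards are included.

```python
import math


def quadratic_fit_step(xs, f):
    """One interpolation step with s = 1, n = 2 and T a quadratic.

    xs holds three distinct points, oldest first. The returned value is the
    stationary point of the parabola through (x, f(x)) at those points,
    i.e. the solution of T'(x_new) = 0 in equation (10).
    """
    x0, x1, x2 = xs
    f0, f1, f2 = f(x0), f(x1), f(x2)
    num = (x2 - x1) ** 2 * (f2 - f0) - (x2 - x0) ** 2 * (f2 - f1)
    den = (x2 - x1) * (f2 - f0) - (x2 - x0) * (f2 - f1)
    if den == 0.0:
        raise ZeroDivisionError("interpolating parabola has no stationary point")
    return x2 - 0.5 * num / den


def quadratic_fit_search(f, xs, tol=1e-6, max_iter=50):
    """Repeat the step, deleting the oldest point each time (hypothetical driver)."""
    xs = list(xs)
    for _ in range(max_iter):
        x_new = quadratic_fit_step(xs, f)
        converged = abs(x_new - xs[-1]) < tol
        xs = xs[1:] + [x_new]  # the oldest point x_{i-n} is deleted
        if converged:
            break
    return xs[-1]


if __name__ == "__main__":
    # Assumed test function: f(x) = exp(x) - 2x, minimized at x = log 2.
    print(quadratic_fit_search(lambda x: math.exp(x) - 2.0 * x, [0.5, 0.6, 0.8]))
```

Only function values are used here, so the attainable accuracy is limited by cancellation in the differences of f; a practical line search would add safeguards.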

The practicality of using a particular class $\mathcal{T}$ depends to a great
extent on the degree of difficulty of solving the (generally nonlinear) system
of equations (9) and equation (10). In the case of the logarithmic functions
(3) and (4), equation (10) is easy to solve, but (9) is an ill-conditioned
nonlinear system of equations. To solve them, Wright [20] uses table look-ups
and applies Newton's method after transforming these equations.

For the conic model studied by Bjørstad and Nocedal [3], equations (9)
and (10) can be reduced to quadratic equations, while for the rational function
(6) discussed in section 3, equations (9) are reduced to a linear system, and
(10) is very easy to solve.

Note  that  in  the  polynomial  case,  (9)  is  a  linear  system.  However, 

(10)  is  difficult  to  solve  unless  T  is  a  low  degree  polynomial.  It  will  be 
shown  in  section  4,  that  this  difficulty  can  be  circumvented  by  employing 
inverse  interpolation. 

Note  also,  that  for  the  function  (8),  the  interpolation  equations  can 
be  reduced  to  a  linear  system,  while  the  solution  of  T(xi+1)  =  0  is  simply 


k 


In this paper we investigate the rate of convergence of these
minimization algorithms. Here we say that the rate of convergence of a
sequence $\{x_i\}$ converging to $\alpha$ is p if there exists a positive
number C such that

$\lim_{i \to \infty} \dfrac{|x_{i+1} - \alpha|}{|x_i - \alpha|^{p}} = C$

(see [19, pp. 1-13]). Ortega and Rheinboldt [11] refer to the rate p defined
above as the C-order of the sequence $\{x_i\}$. When it exists, it coincides
with their so-called Q- and R-orders (see [11, section 9]).

Rate of convergence analysis is supplied by Bjørstad and Nocedal [3]
for the conic function with s = 2, n = 1. The derivation, which uses a
symbol manipulation computer program, is quite elaborate. Moreover, the
analysis does not carry over naturally to the study of the convergence
properties of an algorithm using the same interpolation function but
different data, say s = 1, n = 3.

Wright  [20]  gives  no  rate  of  convergence  analysis  for  the  algorithms 
using  the  logarithmic  interpolating  functions  (3)  and  (4). 

An outline of the paper is as follows. In section 2 we prove rate of
convergence theorems for general $T_{n,s}$-interpolation methods. We show that
the rate of convergence is given by the unique positive root of the indicial
equation

(11)  $t^{n+1} - (s-1)\,t^{n} - s \sum_{j=0}^{n-1} t^{j} = 0$.

Since this equation depends on n and s only, the rate is independent of the
class $\mathcal{T}$.


In section 3 we analyze the specific family of interpolating functions

(6)  $\dfrac{ax^2 + bx + c}{dx - 1}$.

Inverse  interpolation  for  minimization  algorithms  is  introduced  in 
section  4.  We  show  that  the  rate  of  convergence  in  this  case  is  again  given 
as  the  positive  root  of  (11). 

Numerical  examples  illustrating  the  convergence  theorems  are  given 
in  section  5. 

In section 6 we discuss the implications of the rate of convergence
analysis for the design of algorithms.
2.  RATES OF CONVERGENCE OF NONPOLYNOMIAL ALGORITHMS

Traub  [19]  studied  the  rate  of  convergence  of  algorithms  that  use 
polynomials  to  interpolate  f ,  or  its  inverse  function  for  the  root  finding 
problem  (7).  The  natural  modifications  of  these  results  for  the  minimization 
problem are discussed by Tamir [17, 18] for the direct polynomial case, i.e.,
when the interpolation requirements are given by (9), T being a polynomial
of degree < r = s(n+1).

The  key  result  for  this  analysis  is  the  product  form  formula  of  the 
error  incurred  in  hyperosculatory  polynomial  interpolation  (e.g.  [6,  p.  67]). 
Ostrowski [13, p. 12] generalized this formula to the case where the
interpolating  function  is  not  necessarily  a  polynomial.  However,  no  use  of 
this  generalized  formula  has  been  made  to  extend  the  analysis  of  Traub  and 
Tamir  to  the  nonpolynomial  case.  Using  this  formula,  we  will  obtain  a 
difference  equation  which  differs  from  the  one  obtained  by  Tamir  in  its 
right  hand  side  only.  This  implies  that  in  the  nonpolynomial  case  too, 
the  rate  is  given  by  the  positive  root  of  the  indicial  equation  (11). 

Tamir  [17,  18]  gives  two  separate  proofs  for  the  cases  s  =  1,  s  >  1.  We 
will  give  a  unified  proof,  and  settle  his  conjectures  in  [17]. 

Stronger results than ours can evidently be obtained by relaxing
some of our assumptions (compare for example Brent [4]). We have preferred,
however, to keep the presentation unobscured by these technicalities. For
the same reason, we have not stated explicitly the interval of (local)
convergence. This is done in great detail in [17] and repeated in [18].

We will denote by $\alpha$ a solution of (1), and by J the interval
$J = \{x : |x - \alpha| \le L\}$ for some positive L. The error $x_i - \alpha$
will be denoted by $e_i$, and the open interval determined by
$\{a_1, \ldots, a_m\}$ will be denoted by $\langle a_1, \ldots, a_m \rangle$.

The following assumption will be used repeatedly.


Assumption 1  $r = s(n+1) \ge 3$; f and T have continuous derivatives of
order r+1 in J for all $T \in \mathcal{T}$; $f''(\alpha) \ne 0$; $x_i \in J$
and $x_j \ne x_k$ for $j \ne k$, i, j, k = 0, 1, 2, ..., n; $e_k \ne 0$ for
all k.

Note that if $e_i = 0$ for some i, $x_i$ is a solution of (1), and the
algorithm is terminated.

In order that the sequence $\{x_i\}$ defined by the algorithm be well
defined, the interpolation equations (9), as well as equation (10) for
$x_{i+1}$, must have solutions. If $\mathcal{T}$ is the class of polynomials
$P_{n,s}$ of degree less than r = s(n+1), equations (9) have a solution if and
only if $x_k \ne x_i$ for $k \ne i$. To quote Davis [6, p. 2?], the hope that
an interpolation problem can always be solved provided the number of
parameters equals the number of conditions is naive. T can be replaced by
$P_{n,s}$ in iterations at which (9) has no solution, but in practice this
case is rather unlikely.

We will assume henceforth that (9) has a solution for all i.

As  for  equation  (10),  we  will  prove  that  under  Assumption  1,  it  has 
a  solution  for  all  i,  if  L  is  small  enough.  We  need  the  following  difference 
relation  to  prove  this  and  other  results. 


Theorem 1  Under Assumption 1, if $T'' \ne 0$ on J, then the errors
$e_i = x_i - \alpha$ induced by the $T_{n,s}$-interpolation algorithm satisfy
the recursion equation

(12)  $e_{i+1} = M_i \sum_{k=0}^{n} e_{i-k}^{\,s-1} \prod_{j=0,\, j \ne k}^{n} e_{i-j}^{\,s} + N_i \prod_{j=0}^{n} e_{i-j}^{\,s}$,

where

$M_i = \dfrac{(-1)^{r-1}\, s\, M_i(\xi_i(\alpha))}{T''(\theta(x_{i+1}))}$,   $N_i = \dfrac{(-1)^{r}\, N_i(\eta_i(\alpha))}{T''(\theta(x_{i+1}))}$,

$M_i(x) = \dfrac{f^{(r)}(x) - T^{(r)}(x)}{r!}$,   $N_i(x) = \dfrac{f^{(r+1)}(x) - T^{(r+1)}(x)}{(r+1)!}$,

$\xi_i(t),\ \eta_i(t) \in \langle t, x_i, x_{i-1}, \ldots, x_{i-n} \rangle$

and

$\theta(x_{i+1}) \in \langle \alpha, x_{i+1} \rangle$.

Proof.  The error in the interpolation (9) is given by (see [13, p. 12])

(13)  $f(t) = T(t) + \dfrac{f^{(r)}(\xi_i(t)) - T^{(r)}(\xi_i(t))}{r!} \prod_{j=0}^{n} (t - x_{i-j})^{s}$.

Differentiating (13) we have

(14)  $f'(t) = T'(t) + M_i(\xi_i(t))\, W'(t) + N_i(\eta_i(t))\, W(t)$,

where $W(t) = \prod_{j=0}^{n} (t - x_{i-j})^{s}$ and $M_i$, $N_i$, $\xi_i(t)$
and $\eta_i(t)$ are defined above (for proof see [1, section 4], where we
generalize Ralston's result [14, 15] on the differentiation of the error term
to the hyperosculatory case). Substituting $t = \alpha$ in (14) and using

$T'(\alpha) = T'(\alpha) - T'(x_{i+1}) = -e_{i+1}\, T''(\theta(x_{i+1}))$

we obtain (12).

□
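The substitution step at the end of the proof can be spelled out as follows. This is only a sketch of the algebra, using the definition of W(t), the relation $x_{i-j} - \alpha = e_{i-j}$ and r = s(n+1); it does not replace the argument in [1].

```latex
\begin{align*}
W(\alpha) &= \prod_{j=0}^{n} (\alpha - x_{i-j})^{s}
           = (-1)^{r} \prod_{j=0}^{n} e_{i-j}^{\,s}, \\
W'(\alpha) &= s \sum_{k=0}^{n} (\alpha - x_{i-k})^{s-1}
              \prod_{j \ne k} (\alpha - x_{i-j})^{s}
            = (-1)^{r-1}\, s \sum_{k=0}^{n} e_{i-k}^{\,s-1}
              \prod_{j \ne k} e_{i-j}^{\,s}, \\
0 = f'(\alpha) &= -e_{i+1}\, T''(\theta(x_{i+1}))
              + M_i(\xi_i(\alpha))\, W'(\alpha)
              + N_i(\eta_i(\alpha))\, W(\alpha).
\end{align*}
```

Solving the last line for $e_{i+1}$, which is possible because $T'' \ne 0$ on J, gives the two terms of (12) with the stated coefficients.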
Under Assumption 1, $f''(\alpha) \ne 0$. Since $f'(\alpha) = 0$, f' must
change its sign at $\alpha$. It follows by substituting $t = \alpha - L$ and
$t = \alpha + L$ in (14) that T' also has opposite signs at these two points
if L is small enough (for a detailed proof see Appendix A in [18]). We
summarize this result in

Theorem 2  Under Assumption 1, if L is small enough, there exists
$x_{i+1} \in J$ satisfying equation (10).


Using Theorem 1 in [7, chapter 6, section 5], it follows immediately
from the difference equation (12) that if the initial errors
$e_0, \ldots, e_n$ are small enough (i.e., L is small enough), the sequence
$\{e_i\}$ tends to zero, establishing the following local convergence result.


Theorem 3  Under Assumption 1, if L is small enough, the sequence $\{x_i\}$
converges to the solution $\alpha$ of (1).  □

Also note that if L is small enough, and if s > 1, we have by (12)
$|e_{i+1}| < |e_i|$, implying $x_{i+1} \ne x_i$. For s = 1, however, we have
to assume $x_{i+1} \ne x_i$ ([16]).

We  now  replace  (12)  by  a  more  useful  difference  equation. 


Theorem 4  Under the assumptions of Theorem 3, and if the sequences
$\{M_i\}$, $\{N_i\}$ are bounded, then

(15)  $e_{i+1} = A_i\, e_i^{\,s-1} \prod_{j=1}^{n} e_{i-j}^{\,s}$

with $\{A_i\}$ bounded.

Proof.  By assumption, the sequences $\{M_i\}$, $\{N_i\}$ are bounded. If
$s \ge 2$, (12) implies

(16)  $\lim_{i \to \infty} \dfrac{e_{i+1}}{e_i} = 0$

(i.e., superlinear convergence). If s = 1, we must have $n \ge 2$ since we
assumed $r = s(n+1) \ge 3$. For n = 2, (12) is the basic difference relation
governing the behavior of the Quadratic Fit algorithm, which is known to
converge superlinearly (see Theorem 3.4.1 in Brent [4]). It is evident from
(12) that the rate for n > 2 is not less than the rate for n = 2. Therefore,
(16) holds for all $s \ge 1$, $n \ge 0$ if $r = (n+1)s \ge 3$. Rewriting (12)
in the form

(17)  $e_{i+1} = e_i^{\,s-1} \prod_{j=1}^{n} e_{i-j}^{\,s} \left[ M_i + M_i \sum_{k=1}^{n} \dfrac{e_i}{e_{i-k}} + N_i e_i \right]$

we see by (16) that (15) holds with

$A_i = M_i + M_i \sum_{k=1}^{n} \dfrac{e_i}{e_{i-k}} + N_i e_i$.

□


Remark  In [17, Appendix C], Tamir conjectures that his a priori assumption
[17, Assumption 2] on the superlinear convergence of the sequence $\{e_i\}$
is redundant. Our proof shows that this assumption is indeed redundant.

We  now  state  our  main  result. 

Theorem 5  Under the assumptions of Theorem 4, the sequence $\{x_i\}$
generated by the interpolation algorithm converges to the solution $\alpha$
of (1), with Q- and R-rates of convergence at least p, where p is the unique
positive root of the equation

$t^{n+1} - (s-1)\,t^{n} - s \sum_{j=0}^{n-1} t^{j} = 0$.

Proof.  Convergence of $\{x_i\}$ to $\alpha$ is proved in Theorem 3.

Tamir [17, 18] proves, under the additional assumption $A_i \to A \ne 0$,
that the C- (and therefore Q- and R-) rate is exactly p.

Now $A_i \to A \ne 0$ is the worst possible case, for if the sequence
$\{A_i\}$ is bounded (not necessarily convergent), equation (15) implies a
Q-rate of convergence at least p (even though the C-order may fail to exist).

Indeed, the proof in [17, 18] is based on a similar result of Traub [19]
for the root finding problem. We will indicate the slight modifications
necessary in the proof for the latter problem. First, it is evident that if
the sequence of Theorem 3-1 in [19] is bounded above, so is the sequence
$\{\sigma_i\}$. This in turn implies that the sequence $\{D_i\}$ of
Theorem 3-3 of [19] is bounded above, or equivalently

$\limsup_{i \to \infty} \dfrac{|e_{i+1}|}{|e_i|^{p}} < +\infty$,

hence the Q-order of the sequence $\{x_i\}$ is at least p.

The assertion on the R-order follows from Theorem 9.3.2 in [11, section 9].

□


Remark  If the mapping from $x_{i-n}, \ldots, x_i$ to the parameters of T
defined by (9) is continuous, the sequences $\{M_i\}$, $\{N_i\}$ are
convergent, hence bounded, as assumed in Theorem 5.

Corollary  The  rate  of  convergence  of  the  sequence  generated  by  the  interpolation 
algorithm  does  not  depend  on  the  class  of  interpolating  functions  T. 

Remark  It  is  evident  from  our  analysis  that  the  above  corollary  holds  for 
the  root  finding  problem,  as  well  as  for  the  case  when  the  number  of  pieces  of 
information  used  at  the  interpolation  point  depends  on  j  (e.g.  the  False 
Position  Method) . 

It follows from Theorem 5 that the rates of convergence of the interpolation
algorithms using the conic interpolating function (5) are p = 1.46 for s = 1,
n = 3 (4 interpolation points with no derivatives); p = 2 for s = 2, n = 1
(f and f' used at two points); and p = 3 for s = 4, n = 0 (f, f', f'', f'''
used at one interpolation point). Rates of convergence of algorithms using the
interpolating functions mentioned in the introduction can be computed likewise.

The  behavior  of  the  rate  p  as  a  function  of  n ,  for  fixed  s ,  is  summarized 
in  Theorem  6. 

Theorem 6  For fixed s, p is an increasing function of n. For n = 0,
p = s - 1, while for n = 1, p = s. As n tends to infinity, p tends to
$\frac{s}{2} + \sqrt{\left(\frac{s}{2}\right)^{2} + 1}$.

Proof  For n = 0 and n = 1, the rate is obtained by solving the indicial
equations $t - (s-1) = 0$ and $t^{2} - (s-1)t - s = 0$ respectively. The
remaining assertions are proved in Tamir [17, 18].  □

A few numerical values for p are listed in Table 2.1.

TABLE 2.1

   s      n      p
   ------------------------------------
   1      2      1.3
   1      3      1.4
   1      ∞      1.6
   2      1      2
   2      2      2.3
   2      ∞      2.4
   3      0      2
   3      1      3
   3      ∞      3.3
   s      0      s - 1
   s      1      s
   s      ∞      s/2 + sqrt((s/2)^2 + 1)
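The rates in Table 2.1 can be recomputed from the indicial equation (11). The sketch below finds its positive root by bisection on [0, s+1]; the function names and the choice of bisection are assumptions made for illustration, not part of the original analysis.

```python
import math


def indicial_poly(t, n, s):
    """Left-hand side of (11): t^(n+1) - (s-1) t^n - s * sum_{j=0}^{n-1} t^j."""
    return t ** (n + 1) - (s - 1) * t ** n - s * sum(t ** j for j in range(n))


def convergence_rate(n, s, iterations=200):
    """Positive root of (11) by bisection.

    The polynomial has a single sign change in its coefficients, so it has one
    positive root; it is positive at t = s + 1, which gives the upper bracket.
    """
    lo, hi = 0.0, s + 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if indicial_poly(mid, n, s) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def limiting_rate(s):
    """Limit of the rate as n tends to infinity (Theorem 6)."""
    return s / 2 + math.sqrt((s / 2) ** 2 + 1)


if __name__ == "__main__":
    for s, n in [(1, 2), (1, 3), (2, 1), (2, 2), (3, 0), (3, 1)]:
        print(s, n, round(convergence_rate(n, s), 4))
```

For large n the computed root is numerically indistinguishable from `limiting_rate(s)`, which is one way to check the last row of the table.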
Since $\frac{s}{2} + \sqrt{\left(\frac{s}{2}\right)^{2} + 1}$ is close to s
even for small values of s (see Table 2.1), Theorem 6 implies that algorithms
using more than two interpolation points (n > 1) are inefficient. However,
two-point algorithms are substantially faster than one-point (or memoryless)
algorithms. Instead of making the last statement precise by defining a measure
of efficiency (chosen carefully to suit the authors' purpose), we will note
that the transition from n = 0 to n = 1 involves storage (but no computation)
of s extra pieces of data. In addition to this, the system of equations (9)
will involve 2s instead of s unknowns.

However, this system is linear in the polynomial and rational cases (which are
the most important ones) and needs to be solved only once for the class
$\mathcal{T}$. The main difficulty is the solution of equation (10). In the
case s = 3, n = 1 (Newton's Method with memory) with polynomial interpolation,
this is a polynomial equation of degree 4. Solution of this equation can be
avoided by using inverse interpolation, to be discussed in section 4. On the
other hand, for line search algorithms, computation of $f^{(k)}(t)$ involves
in fact computation of the derivatives of a function on $R^n$ (i.e., gradient
vectors and Hessian matrices), making the extra effort worthwhile.
3.  A CLASS OF RATIONAL INTERPOLATING FUNCTIONS

In this section we briefly discuss the four parameter rational interpolating
function

(18)  $R(x) = \dfrac{ax^2 + bx + c}{dx - 1}$.

Writing (18) in the form

(19)  $(dx - 1)\,R(x) = ax^2 + bx + c$,

differentiating (19) implicitly and then using the interpolation equations (9)
leads to a linear system of equations for the coefficients a, b, c, d. For
example, with data s = 4, n = 0, the equations are

$(dx_i - 1)\, f(x_i) = a x_i^2 + b x_i + c$
$(dx_i - 1)\, f'(x_i) + d\, f(x_i) = 2a x_i + b$
$(dx_i - 1)\, f''(x_i) + 2d\, f'(x_i) = 2a$
$(dx_i - 1)\, f'''(x_i) + 3d\, f''(x_i) = 0$.

Note that if d = 0, R(x) has no singularity. Therefore, it may be expected
that R(x) will provide a good fit to functions with regular or singular
behavior.

We now turn our attention to the solution of (10) for $x_{i+1}$. If d = 0,
R(x) is a quadratic and (10) yields $x_{i+1} = -\dfrac{b}{2a}$. For
$d \ne 0$, it is convenient to rewrite R(x) in the form

(20)  $R(x) = \alpha x + \beta + \dfrac{\gamma}{x - \delta}$,

where

$\alpha = \dfrac{a}{d}$,  $\beta = \dfrac{b}{d} + \dfrac{a}{d^2}$,  $\gamma = \dfrac{c}{d} + \dfrac{b}{d^2} + \dfrac{a}{d^3}$,  $\delta = \dfrac{1}{d}$.


Differentiating (20) we have

(21)  $R'(x) = \alpha - \dfrac{\gamma}{(x - \delta)^2}$

(22)  $R''(x) = \dfrac{2\gamma}{(x - \delta)^3}$.

From (22) we see that R''(x) has exactly one change of sign, at $x = \delta$.
The point $x_{i+1}$ will be a minimum of R if

(23)  $R'(x_{i+1}) = 0$

(24)  $R''(x_{i+1}) > 0$.

From (21)-(24) we have

(25)  $x_{i+1} = \delta \pm \sqrt{\gamma / \alpha}$,

assuming

(26)  $\alpha \gamma > 0$.

The two solutions in (25) correspond to the minimum point of the convex
branch of R and the maximum point of the concave branch of R. Multiplying
(22) by $(x - \delta)^4$, we see that in order for (24) to hold, we must have
$\gamma (x_{i+1} - \delta) > 0$, which combined with (25) yields

$x_{i+1} = \delta + \epsilon \sqrt{\gamma / \alpha}$,   $\epsilon = \mathrm{sign}\, \gamma$.

Condition  (26)  will  hold  near  the  solution  under  the  assumptions  of  Theorem  2. 
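For the one-point case s = 4, n = 0, the step can be written directly in terms of f', f'' and f''' at the current point. The sketch below is an illustration under stated assumptions: instead of solving the linear system for a, b, c, d, it matches R', R'' and R''' of the form (20) to the data, which gives $\delta = x + 3f''/f'''$, $\gamma = f''(x-\delta)^3/2$ and $\alpha = f' + \gamma/(x-\delta)^2$ (this elimination is ours and assumes $f'' \ne 0$). The function names are hypothetical, and the test function is the second example of section 5 as we read it, f(x) = x + 1/(e^{x-1} - 1).

```python
import math


def rational_step(x, d1, d2, d3):
    """One step of the rational method with s = 4, n = 0.

    d1, d2, d3 are f'(x), f''(x), f'''(x); d2 is assumed nonzero.
    """
    if d3 == 0.0:
        # d = 0: R is a quadratic and the step coincides with Newton's method.
        return x - d1 / d2
    delta = x + 3.0 * d2 / d3              # from R'''/R'' = -3/(x - delta)
    gamma = 0.5 * d2 * (x - delta) ** 3    # from R'' = 2*gamma/(x - delta)^3
    alpha = d1 + gamma / (x - delta) ** 2  # from R' = alpha - gamma/(x - delta)^2
    if alpha * gamma <= 0.0:
        raise ValueError("condition (26) fails: alpha * gamma must be positive")
    # Minimum of the convex branch: delta + sign(gamma) * sqrt(gamma / alpha).
    return delta + math.copysign(math.sqrt(gamma / alpha), gamma)


def derivatives(x):
    """f', f'', f''' for the assumed test function f(x) = x + 1/(exp(x-1) - 1)."""
    u = math.exp(x - 1.0)
    d1 = 1.0 - u / (u - 1.0) ** 2
    d2 = u * (u + 1.0) / (u - 1.0) ** 3
    d3 = -u * (u * u + 4.0 * u + 1.0) / (u - 1.0) ** 4
    return d1, d2, d3


def rational_line_search(x, steps=6):
    """Apply a fixed number of unsafeguarded steps (hypothetical driver)."""
    for _ in range(steps):
        x = rational_step(x, *derivatives(x))
    return x


if __name__ == "__main__":
    print(rational_line_search(1.75))
```

Note that the value f(x) itself is not needed for the step, since $\beta$ does not enter (25).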

Remarks .  Rational  interpolations  are  particularly  useful  in  cases  where  f, 
or  its  derivatives,  have  rapid  changes,  even  when  f  has  no  singularities  (see 
section  5). 

Use of rational functions other than (5) and (6) suggests itself, especially
when higher degree interpolation is needed, possibly combined with inverse
interpolation (see section 4).
4.  INVERSE INTERPOLATION FOR LINE SEARCH

Inverse interpolation methods for the root finding problem f(x) = 0 are
well known. Assuming that f' is nonzero and $f^{(r)}(x)$ is continuous on an
interval J mapped by f onto K, then f has an inverse F, and $F^{(r)}$
is continuous on K. If T is a hyperosculatory interpolating function
satisfying

(27)  $T^{(k)}(y_{i-j}) = F^{(k)}(y_{i-j})$,  $y_{i-j} = f(x_{i-j})$,  j = 0, ..., n;  k = 0, ..., s-1,

then

(28)  $F(t) = T(t) + \dfrac{F^{(r)}(\theta_i(t)) - T^{(r)}(\theta_i(t))}{r!} \prod_{j=0}^{n} (t - y_{i-j})^{s}$,

with $\theta_i(t) \in \langle t, y_i, y_{i-1}, \ldots, y_{i-n} \rangle$. In
the inverse interpolation algorithm for the root finding problem, we
approximate $\alpha = F(0)$ by $x_{i+1} = T(0)$.

The derivatives of the inverse function F can be expressed in terms of the
derivatives of f. Indeed, letting

$\beta_k = \dfrac{F^{(k)}}{k!}$,   $a_k = \dfrac{f^{(k)}}{k!}$,   k = 1, 2, ...,

we have (see [12])

(29)  $\beta_n = \sum (-1)^{n-k_1-1}\, \dfrac{(2n-k_1-2)!}{n!\, k_2! \cdots k_n!}\; a_1^{-(2n-k_1-1)}\, a_2^{k_2} \cdots a_n^{k_n}$,

where the summation is taken over all $k_1, k_2, \ldots, k_n$ satisfying

$\sum_{i=1}^{n} k_i = n - 1$,   $\sum_{i=1}^{n} i\, k_i = 2n - 2$,   $k_i \ge 0$.
Let T be a polynomial $Q_{n,s}$ of degree < r. By the above and since
$y_{i-j} = f(x_{i-j})$ and $F(y_{i-j}) = x_{i-j}$, $Q_{n,s}$ can be expressed
in terms of the data $x_{i-j}$ and $f^{(k)}(x_{i-j})$. If T is not a
polynomial, we first construct the interpolating polynomial $Q_{n,s}$
satisfying (27), and proceed to solve the system

$T^{(k)}(y_{i-j}) = Q_{n,s}^{(k)}(y_{i-j})$,  j = 0, ..., n;  k = 0, ..., s-1,

$y_{i-j} = f(x_{i-j})$.

Traub [19] shows that the rate of convergence of the polynomial inverse
interpolation algorithm is given by the positive root of the (root finding)
indicial equation $t^{n+1} - s \sum_{j=0}^{n} t^{j} = 0$, exactly as in the
case of direct polynomial interpolation. Similar to our derivation in
section 2, it can be shown that the rate of convergence is independent of the
interpolating class of functions.

Inverse interpolation has not been applied so far to the solution of line
search problems. We will define the $T_{n,s}$-inverse interpolation algorithm,
and prove that under the appropriate assumptions, its rate of convergence is
given by the positive solution of the indicial equation (11).

A difficulty in applying inverse interpolation to the line search problem
is that one cannot assume that f has an inverse near an extremum point
$\alpha$, since necessarily $f'(\alpha) = 0$. Denoting, however, g = f', we
can write equation (1) as $g(\alpha) = 0$. Assuming that $\alpha$ is a simple
zero of g, g has an inverse G defined on a neighborhood of $g(\alpha)$. Since
the solution $\alpha$ of (1) satisfies $g(\alpha) = 0$, it is given by

(30)  $\alpha = G(0)$.
The assumption on the differentiability of g implies that G is differentiable.
Hence G is continuous and has a primitive function F (i.e., an indefinite
integral of G), satisfying

(31)  $F'(t) = G(t)$.

Equation (31) determines F up to an additive constant. By (30), $\alpha$ is
given in terms of any solution F of (31) by

(32)  $\alpha = F'(0)$.

Now let F be any solution of (31), and let T be a hyperosculatory
interpolating function satisfying

(33)  $T^{(k)}(y_{i-j}) = F^{(k)}(y_{i-j})$,  j = 0, ..., n;  k = 0, ..., s-1,

$y_{i-j} = f'(x_{i-j})$.

The inverse interpolation process for the solution of (1) consists of
approximating $\alpha$ in (32) by

(34)  $x_{i+1} = T'(0)$.
Evidently, $x_{i+1}$ as defined by (34) is independent of the particular
integration constant associated with F. Let $Q_{n,s}$ be the interpolation
polynomial of degree < r satisfying (33). We will later express (34) in this
case in terms of the data. If T is not a polynomial, we can express equations
(33) in terms of the data by first constructing $Q_{n,s}$ (i.e., replace T by
$Q_{n,s}$ in (33)) and then interpolating $Q_{n,s}$ by T (i.e., replace F by
$Q_{n,s}$ in (33)).

In order to write (34) explicitly in the polynomial case, we can proceed to
construct $P_{n,s}(x)$, the direct interpolating polynomial determined by (9),
and differentiate it to obtain

$P'_{n,s}(x) = \sum_{k=0}^{r-2} a_k (x - x_i)^{k}$,

from which we obtain directly by [12] the inverse interpolation formula for
line search

(35)  $x_{i+1} = x_i + \sum_{k=1} \beta_k (-a_0)^{k}$,

where $\beta_k$ is given in terms of the $a_k$'s in (29).

Remark .  If  P(x)  is  a  quadratic  (which  is  the  case  for  the  classical  Newton, 
False  Position  and  Quadratic  Fit  methods),  P'(x)  is  a  linear  function,  with 
linear  inverse,  so  that  in  this  case  the  direct  and  inverse  interpolation  formulas 
coincide . 

The inverse interpolation formula for s = 4, n = 0 (with rate 3) will differ,
by the above argument, from the direct interpolation formula for this case.
Writing $f_i^{(k)}$ for $f^{(k)}(x_i)$, it is given by

(36)  $x_{i+1} = x_i - \dfrac{f'_i}{f''_i} - \dfrac{(f'_i)^2\, f'''_i}{2\,(f''_i)^3}$.

Note that omitting the term with the third order derivative in (36) yields
Newton's method. Note also that the direct interpolation formula in this case
is given by the solution of the quadratic equation $P'_{0,4}(x) = 0$, where

$P_{0,4}(x) = f_i + (x - x_i) f'_i + \tfrac{1}{2} (x - x_i)^2 f''_i + \tfrac{1}{6} (x - x_i)^3 f'''_i$.
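Formula (36) amounts to a second order Taylor expansion of the inverse G of g = f' about y = f'(x_i), evaluated at 0, with G' = 1/f'' and G'' = -f'''/(f'')^3. A minimal sketch follows; the function names are hypothetical, and the test function is the second example of section 5 as we read it, f(x) = x + 1/(e^{x-1} - 1).

```python
import math


def inverse_step(x, d1, d2, d3):
    """One step of (36); d1, d2, d3 are f'(x), f''(x), f'''(x), with d2 nonzero."""
    newton = x - d1 / d2                        # dropping the f''' term gives Newton
    return newton - d3 * d1 ** 2 / (2.0 * d2 ** 3)


def derivatives(x):
    """f', f'', f''' for the assumed test function f(x) = x + 1/(exp(x-1) - 1)."""
    u = math.exp(x - 1.0)
    d1 = 1.0 - u / (u - 1.0) ** 2
    d2 = u * (u + 1.0) / (u - 1.0) ** 3
    d3 = -u * (u * u + 4.0 * u + 1.0) / (u - 1.0) ** 4
    return d1, d2, d3


def inverse_line_search(x, steps=6):
    """Apply a fixed number of unsafeguarded steps (hypothetical driver)."""
    for _ in range(steps):
        x = inverse_step(x, *derivatives(x))
    return x


if __name__ == "__main__":
    print(inverse_line_search(1.75))
```

Unlike the direct formula, no quadratic equation has to be solved and no choice between two roots arises.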


We now turn to the analysis of the rate of convergence of this class of
algorithms, starting with the derivation of a basic difference equation.
Theorem 7.  Let $f'' \ne 0$ and let $f^{(r+1)}$, $T^{(r+1)}$ be continuous on
an interval J. Let the derivative G of F be the inverse of g = f', and let
$x_{i+1}, x_i, \ldots, x_{i-n} \in J$, where $x_{i+1} = T'(0)$ and T satisfies
(33). Then

(37)  $e_{i+1} = \sum_{k=0}^{n} K_{i-k}\, e_{i-k}^{\,s-1} \prod_{j=0,\, j \ne k}^{n} e_{i-j}^{\,s} + L_i \prod_{j=0}^{n} e_{i-j}^{\,s}$,

where

$K_{i-k} = (-1)^{r}\, s\, K_i(\xi_i(0))\, [f''(\theta_{i-k})]^{s-1} \prod_{j \ne k} [f''(\theta_{i-j})]^{s}$,

$L_i = (-1)^{r+1}\, L_i(\eta_i(0)) \prod_{j=0}^{n} [f''(\theta_{i-j})]^{s}$,

$K_i(x) = \dfrac{F^{(r)}(x) - T^{(r)}(x)}{r!}$,   $L_i(x) = \dfrac{F^{(r+1)}(x) - T^{(r+1)}(x)}{(r+1)!}$,

$\xi_i(t),\ \eta_i(t) \in \langle t, y_i, y_{i-1}, \ldots, y_{i-n} \rangle$,  and  $\theta_{i-j} \in \langle x_{i-j}, \alpha \rangle$.

Proof.  The proof is similar to the proof of Theorem 1 and will be omitted.

□


The  interested  reader  can  find  the  proof  of  Theorem  7  as  well  as  the  inverse 
interpolation  formulas  for  the  cases  n=l,  s  =  2  and  n=l,  s  =  3  in  [1]. 
The following theorem characterizes the behavior of the inverse interpolation
process for the line search problem (1).

Theorem 8.  Let f and T have continuous derivatives of order r+1 in the
interval $J = \{x : |x - \alpha| < L\}$. Let $f''(\alpha) \ne 0$. If L is
small enough, if $x_0, \ldots, x_n \in J$ and the sequence $\{x_i\}$ is
constructed by the inverse interpolation algorithm for line search (i.e.,
$x_{i+1} = T'(0)$, where T satisfies (33)), then $x_{i+1} \in J$. Furthermore,
if the algorithm does not terminate, and the sequence $\{K_i\}$ defined in
Theorem 7 is bounded, then $x_i \to \alpha$ with Q- and R-rates of convergence
at least p, where p is the unique positive root of

(38)  $t^{n+1} - (s-1)\,t^{n} - s \sum_{j=0}^{n-1} t^{j} = 0$.

Proof.  The proof is identical with the proof of Theorem 5.

Equation (38) is of course identical with equation (11), which is the indicial
equation of the derived difference equation (15).
5.  NUMERICAL EXAMPLES

The  purpose  of  this  section  is  to  illustrate  that  the  theoretical  rate  of 
convergence  predicted  by  the  preceding  theorems,  is  well  reflected  in  the  actual 
behavior  of  the  various  direct  and  inverse  algorithms.  Ten  algorithms  (without 
safeguards)  are  applied  to  minimizing  two  functions: 


$f(x) = \tfrac{1}{6} x^6 - x^3 + 2x$


and 


$f(x) = x + \dfrac{1}{e^{x-1} - 1}$.


The  first  function,  although  nonsingular,  behaves  very  much  like  a  singular 

one  in  the  interval  [0,2],  due  to  rapid  changes  of  f  and  its  derivatives  in 

the  interval.  This  in  particular  caused  the  cubic  fit  method  to  diverge. 

The  second  function  is  highly  singular  at  x  = 1 .  For  this  function,  three 

of  the  methods  based  on  polynomial  interpolation  diverged.  In  contrast,  all  four 

methods  based  on  rational  interpolation  worked  well. 

The results are summarized in Tables 5.1 and 5.2. The Rational and Conic
functions referred to in these tables are $\dfrac{ax^2 + bx + c}{dx - 1}$ and
$\dfrac{ax^2 + bx + c}{(dx + 1)^2}$ respectively.

Initial values used in Table 5.1 are {2}, {2, 2.1}, {1.9, 2, 2.1}, and
{2, 2.1, 2.3, 2.31}, according to the number of interpolation points. Initial
values used in Table 5.2 are {1.75}, {1.7, 1.8}, {1.7, 1.75, 1.8}, and
{1.7, 1.73, 1.77, 1.8}. The $Q_p(x)$'s are the quotients
$\dfrac{|x_{i+1} - x^*|}{|x_i - x^*|^{p}}$, where $x^*$ are the solutions
1.120742611 and 1.962423650 respectively.
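The quotients $Q_p$ are straightforward to compute from a list of iterates. The sketch below uses a hypothetical helper name; the sample errors are the first three entries of the x - x* column of Table 5.1 (Quadratic Fit, s = 1, n = 2), and p is taken as the positive root of $t^3 - t - 1 = 0$, which is what (11) gives for that case.

```python
def q_quotients(xs, x_star, p):
    """Return |x_{i+1} - x*| / |x_i - x*|^p for consecutive iterates in xs."""
    errs = [abs(x - x_star) for x in xs]
    return [errs[i + 1] / errs[i] ** p for i in range(len(errs) - 1)]


if __name__ == "__main__":
    x_star = 1.120742611
    p = 1.3247179572  # positive root of t^3 - t - 1 = 0 (s = 1, n = 2)
    # Iterates rebuilt from the tabulated errors x - x* of Table 5.1.
    xs = [x_star + e for e in (9.792573887e-1, 5.529105297e-1, 4.863864194e-1)]
    print(q_quotients(xs, x_star, p))
```

If the sequence converges with C-order p, these quotients settle near a positive constant; bounded but oscillating quotients, as in Table 5.1, are still consistent with a Q-rate of at least p.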

The algorithms were coded in APLSF-V01 on a Digital Equipment Corporation DEC-10
computer, using double precision arithmetic. We stopped when |f'(x)| < 10^{-8},
except for the Quadratic Fit algorithm, which was terminated after 12 steps on the
first function.
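
For orientation, the unsafeguarded Newton iteration x_{i+1} = x_i - f'(x_i)/f''(x_i) with a stopping test on |f'(x)| can be sketched as follows. This is a Python illustration, not the original APL code: the function name, the default tolerance `tol=1e-8` and the step cap are assumptions of the sketch, and the derivatives were worked out by hand from the two test functions.

```python
import math


def newton_line_search(df, d2f, x0, tol=1e-8, max_steps=50):
    """Iterate x <- x - f'(x)/f''(x) until |f'(x)| < tol; no safeguards."""
    xs = [x0]
    while abs(df(xs[-1])) >= tol and len(xs) <= max_steps:
        x = xs[-1]
        xs.append(x - df(x) / d2f(x))
    return xs


# First test function: f(x) = x**6/6 - x**3 + 2x
def df1(x):
    return x ** 5 - 3.0 * x ** 2 + 2.0


def d2f1(x):
    return 5.0 * x ** 4 - 6.0 * x


# Second test function: f(x) = x + 1/(exp(x - 1) - 1), with y = exp(x - 1)
def df2(x):
    y = math.exp(x - 1.0)
    return 1.0 - y / (y - 1.0) ** 2


def d2f2(x):
    y = math.exp(x - 1.0)
    return y * (y + 1.0) / (y - 1.0) ** 3


for df, d2f, x0 in [(df1, d2f1, 2.0), (df2, d2f2, 1.75)]:
    xs = newton_line_search(df, d2f, x0)
    print(len(xs) - 1, "steps, x =", xs[-1])
```

Started from x = 2 and x = 1.75, the iterates should approach the solutions 1.120742611 and 1.962423650 quoted above.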
TABLE 5.1:  Solution of f'(x) = 0,  f(x) = (1/6) x^6 - x^3 + 2x

Algorithm: Polynomial (Quadratic Fit).  Data: f at 3 points.  Rate: 1.3

No.    f'(x)               x - x*              Qp(x)
  0    2.961101000E1       9.792573887E-1
  1    6.728546951E0       5.529105297E-1      5.684783867E-1
  2    4.972863922E0       4.863864194E-1      1.066328870E0
  3    2.882038163E0       3.815942441E-1      9.914324335E-1
  4    1.346951822E0       2.647918769E-1      9.487772154E-1
  5    7.684644291E-1      1.976246895E-1      1.149025754E0
  6    3.992167669E-1      1.364989189E-1      1.169340571E0
  7    1.997530860E-1      8.876782058E-2      1.241553450E0
  8    1.005668367E-1      5.535730732E-2      1.369114301E0
  9    4.718635448E-2      3.104070036E-2      1.435060514E0
 10    2.034304172E-2      1.523054207E-2      1.515237702E0
 11    7.613347509E-3      6.175096344E-3      1.577732360E0
 12    2.210373360E-3      1.865705167E-3      1.576220344E0

Algorithm: Rational.  Data: f at 4 points.  Rate: 1.4

No.    f'(x)               x - x*              Qp(x)
  0    5.049343000E1       1.179257389E0
  1    3.709091269E0       4.277185667E-1      3.359004532E-1
  2    2.281684841E0       3.422946382E-1      1.188404178E0
  3    1.040389540E0       2.320417711E-1      1.116698663E0
  4    3.936613400E-1      1.353720412E-1      1.151693577E0
  5    1.297609663E-1      6.634405486E-2      1.243392255E0
  6    5.030661751E-2      3.267536431E-2      1.741620451E0
  7    1.550631811E-2      1.194496127E-2      1.797630915E0
  8    3.409830478E-3      2.851697807E-3      1.875539531E0
  9    4.146773605E-4      3.550433290E-4      1.905497807E0
 10    2.060197138E-5      1.769587929E-5      2.012243570E0
 11    2.394566790E-7      2.057134096E-7      1.896026810E0
 12    3.616562856E-10     3.106937485E-10     1.959848404E0

Algorithm: Polynomial (Newton).  Data: f, f', f'' at 1 point.  Rate: 2

No.    f'(x)               x - x*              Qp(x)
  0    2.200000000E1       8.792573887E-1
  1    6.811135228E0       5.557279770E-1      7.188366439E-1
  2    2.037118820E0       3.243497512E-1      1.050241190E0
  3    5.799609462E-1      1.692501469E-1      1.608799476E0
  4    1.528615287E-1      7.426608718E-2      2.592581600E0
  5    3.407897860E-2      2.375875756E-2      4.307672124E0
  6    4.649413295E-3      3.852395563E-3      6.824697750E0
  7    1.546338457E-4      1.326761876E-4      8.939870659E0
  8    1.945599880E-7      1.671434406E-7      9.495183683E0
  9    3.094629587E-13     2.658559137E-13     9.516289591E0
Algorithm: Rational.  Data: f, f' at 2 points.  Rate: 2

No.    f'(x)               x - x*              Qp(x)
  0    2.961101000E1       9.792573887E-1
  1    2.810902098E0       3.772360335E-1      3.933865033E-1
  2    8.691731487E-1      2.111129278E-1      1.483503206E0
  3    1.523995812E-1      7.411334564E-2      1.662902405E0
  4    2.864145851E-2      2.050884987E-2      3.733777795E0
  5    2.435379612E-3      2.052034818E-3      4.878677532E0
  6    3.036533579E-5      2.607995338E-5      6.193517372E0
  7    4.411705579E-9      3.790033105E-9      5.572234465E0

Algorithm: Inverse Polynomial.  Data: f, f' at 2 points.  Rate: 2

No.     f'(x)               x - x*              Qp(x)
  0     2.961101000E1       9.792573887E-1
  1     5.486944138E0       5.073422766E-1      5.290629378E-1
  2     1.395258096E-1      6.976612359E-2      2.710456778E-1
  3     6.937587796E-1     -4.193049971E-1     -8.614713769E1
  4     1.185756774E-1     -2.000676860E-1     -1.137933473E0
  5     2.169400795E-2     -1.399295964E-1     -3.495873294E0
  6     8.943116552E-4     -1.216314003E-1     -6.211929863E0
  7     1.741909068E-6     -1.207443531E-1     -8.161602061E0
  8     6.889045107E-12    -1.207426113E-1     -8.281841327E0

Algorithm: Conic.  Data: f, f' at 2 points.  Rate: 2

No.    f'(x)               x - x*              Qp(x)
  0    2.961101000E1       9.792573887E-1
  1    2.216090916E0       3.376086171E-1      3.520625326E-1
  2    6.181358832E-1      1.753877494E-1      1.538764680E0
  3    9.276089320E-2      5.219436396E-2      1.696778292E0
  4    1.397758065E-2      1.086966678E-2      3.989964315E0
  5    6.087564621E-4      5.203953372E-4      4.404543820E0
  6    1.681494620E-6      1.444528151E-6      5.334076214E0
  7    1.126799608E-11     9.680174631E-12     4.639072635E0

Algorithm: Inverse Polynomial.  Data: f, f', f'' at 2 points.  Rate: 3

No.     f'(x)               x - x*              Qp(x)
  0     2.961101000E1       9.792573887E-1
  1     2.803096043E0       3.767533945E-1      4.012052457E-1
  2     2.236092670E-1      9.550086875E-2      1.785812261E0
  3     9.894209873E-3      7.900627329E-3      9.070675067E0
  4     1.778781896E-6      1.528103896E-6      3.098618810E0
  5    -8.673617380E-19     0                   0

Algorithm: Inverse Polynomial.  Data: f, f', f'', f''' at 1 point.  Rate: 3

No.    f'(x)               x - x*              Qp(x)
  0    2.200000000E1       8.792573887E-1
  1    3.896702674E0       4.372031449E-1      6.431839487E-1
  2    6.374399616E-1      1.784081162E-1      2.134837269E0
  3    8.957458010E-2      5.087163547E-2      8.958429431E0
  4    7.053276768E-3      5.743407965E-3      4.362571583E1
  5    3.047780125E-5      2.617652282E-5      1.381665641E2
  6    3.552651662E-12     3.052032585E-12     1.701583625E2
Algorithm: Rational.  Data: f, f', f'', f''' at 1 point.  Rate: 3

No.    f'(x)               x - x*              Qp(x)
  0    2.200000000E1       8.792573887E-1
  1    2.418567463E0       3.517996559E-1      5.175440627E-1
  2    2.478962995E-1      1.019923738E-1      2.342510081E0
  3    1.858685452E-2      1.405709876E-2      1.324928972E1
  4    1.651599946E-4      1.416954968E-4      5.101160170E1
  5    2.300722529E-10     1.976517860E-10     6.947564682E1

TABLE 5.2:  Solution of f'(x) = 0,  f(x) = x + 1/(e^{x-1} - 1)

Algorithm: Polynomial (Quadratic Fit).  Data: f at 3 points.  Rate: 1.3

No.     f'(x)               x - x*              Qp(x)
  0    -4.817670946E-1     -1.624236501E-1
  1    -1.595324554E-1     -6.425557875E-2      7.138022810E-1
  2    -6.867413860E-2     -2.931057467E-2      1.112259825E0
  3    -1.980314467E-2     -8.735336800E-3      9.376484668E-1
  4    -3.136116528E-3     -1.399442557E-3      7.467562847E-1
  5    -3.604098014E-4     -1.611395109E-4      9.727620112E-1
  6    -1.616424827E-5     -7.228789794E-6      7.646332165E-1
  7    -2.762386271E-7     -1.235376457E-7      7.981553749E-1
  8    -2.808337059E-9     -1.255926503E-9      1.779843092E0


Algorithm: Rational.  Data: f at 4 points.  Rate: 1.4

No.     f'(x)               x - x*              Qp(x)
  0    -4.817670946E-1     -1.624236501E-1
  1    -1.793161898E-4     -8.018257350E-5      1.150614171E-3
  2    -4.093437144E-5     -1.830588290E-5      1.842693963E1
  3    -1.806096932E-8     -8.077110918E-9      7.083809281E-2
  4    -1.465487887E-12    -6.553780955E-13     4.753184221E-1

Algorithm: Polynomial (Newton).  Data: f, f', f'' at one point.  Rate: 2

No.     f'(x)               x - x*              Qp(x)
  0    -6.967368901E-1     -2.124236501E-1
  1    -1.623305337E-1     -6.527012224E-2      1.446467539E0
  2    -1.470965849E-2     -6.511392281E-3      1.528428081E0
  3    -1.480594988E-4     -6.620735904E-5      1.561559526E0
  4    -1.534158280E-8     -6.860964325E-9      1.565210065E0
  5    -1.647987302E-16    -6.548581122E-17     1.391159384E0
Algorithm: Rational.  Data: f, f' at 2 points.  Rate: 2

No.     f'(x)               x - x*              Qp(x)
  0    -4.817670946E-1     -1.624236501E-1
  1    -1.754343274E-4     -7.844698287E-5      2.973566893E-3
  2    -2.441789574E-8     -1.092001475E-8      1.774478474E0
  3    -1.864827737E-17     0                   0


Algorithm: Conic.  Data: f, f' at 2 points.  Rate: 2

No.     f'(x)               x - x*              Qp(x)
  0    -4.817670946E-1     -1.624236501E-1
  1     1.617959193E-2      7.318732791E-3      2.774197391E-1
  2    -1.324951426E-4     -5.924813411E-5      1.106121656E0
  3     1.862712947E-9      8.330305638E-10     2.373075636E-1


Algorithm: Inverse Polynomial.  Data: f, f', f'', f''' at one point.  Rate: 3

No.     f'(x)               x - x*              Qp(x)
  0    -6.967368901E-1     -2.124236501E-1
  1    -5.067809765E-2     -2.189048593E-2      2.283740747E0
  2    -6.296381781E-5     -2.815703434E-5      2.684236045E0
  3    -1.364355677E-13    -6.100762256E-14     2.732897665E0

Algorithm: Rational.  Data: f, f', f'', f''' at one point.  Rate: 3

No.     f'(x)               x - x*              Qp(x)
  0    -6.967368901E-1     -2.124236501E-1
  1    -1.846146248E-4     -8.255150214E-5      8.612245055E-3
  2    -1.397579968E-14    -6.241968747E-15     1.109549416E-2




6.  CONCLUDING  REMARKS 

Our  analysis  points  to  the  inefficiency  of  interpolation  algorithms  based 
on  more  than  two  interpolation  points  (or  more  than  three  points  if  function 
values only are used). Two-point algorithms are significantly faster than one-point
algorithms; the latter are therefore useful only if computation of the
derivatives of f is relatively very cheap.

Use  of  inverse  interpolation  is  recommended  if  equation  (10)  is  difficult 
to solve. Note that even in the Cubic Fit case, where the interpolating function
is a cubic, solution of equation (10) involves computation of square roots (see
[10, p. 142]), in itself a relatively costly operation on the computer.

Moreover,  our  results  allow  the  design  of  a  one-dimensional  minimization 
algorithm  without  regard  to  the  choice  of  the  r-parameter  family  of  interpolating 
functions  used.  Special  structure  of  the  problem  can  be  taken  advantage  of  by 
using  appropriate  r-parameter  interpolants  with  the  assurance  that  the  rate  of 
convergence  will  not  be  impaired.  Also,  when  combined  with  a  safeguarding  technique, 
one  might  want  to  compute  several  guesses  to  the  minimizer  based  upon  different 
r-parameter  interpolation  of  the  same  data,  and  then  use  the  "best"  guess.  This 
can  be  used  to  great  advantage,  for  example,  when  one  guess  is  outside  the  interval 
of uncertainty or undefined. Clearly, the rate of convergence will be unhampered
even though iterates may be selected from a (finite) number of r-parameter families
of interpolants.
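
A minimal sketch of such a selection step is given below. It is an illustration under stated assumptions, not a prescribed procedure: the name `choose_guess` is hypothetical, an undefined guess is represented by `None` or a non-finite number, "best" is taken to mean the smallest function value, and the midpoint fallback is an arbitrary choice.

```python
import math


def choose_guess(candidates, lo, hi, f):
    """Pick the admissible candidate with the smallest f value.

    candidates: guesses from different interpolants of the same data;
                None or a non-finite value marks an undefined guess.
    lo, hi:     current interval of uncertainty.
    """
    admissible = [x for x in candidates
                  if x is not None and math.isfinite(x) and lo < x < hi]
    if not admissible:
        return 0.5 * (lo + hi)  # assumed fallback when every guess is rejected
    return min(admissible, key=f)


def f(x):
    return x ** 6 / 6.0 - x ** 3 + 2.0 * x  # first test function of Section 5


print(choose_guess([None, float("nan"), 3.5, 1.2, 1.6], 1.0, 2.0, f))
```

In this example the undefined guesses and the guess 3.5 outside (1, 2) are discarded, and 1.2 is preferred over 1.6 because it gives the smaller function value.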

Note  that  the  procedure  of  safeguarding  by  bracketing  as  suggested  in  [10, 
section  7.3],  may  severely  affect  the  rate  of  convergence,  since  the  basic  difference 
equations  may  be  fundamentally  changed  by  such  modifications. 

Assume, for example, that we modify the Quadratic Fit algorithm so that one of
the points x_{i+1}, x_i, x_{i-1}, x_{i-2} (not necessarily x_{i-2}) is dropped, in such a
manner that the remaining points bracket the solution α. Then we may choose
e_3 < 0 < e_2 < e_1 and small enough L, such that equation (15) of Theorem 4 would
imply that, for M > 0, we have e_{i+1} > 0 for all i. Hence, in the bracketing
algorithm, one of the three interpolation points is fixed as x_3, and in the
difference relation (15) one of the indexes should be replaced by 3, leading to a
difference equation with an indicial equation different from (11).
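
The effect of a fixed interpolation point can be seen in a simpler analogue, which is not the modified Quadratic Fit algorithm discussed here: two-point linear interpolation of f' (the secant iteration) compared with its bracketing counterpart (false position). The Python sketch below is an illustration only; the function names, starting values, tolerance and step cap are all assumptions, and f' is taken from the first test function of Section 5.

```python
def g(x):
    """f'(x) for f(x) = x**6/6 - x**3 + 2x."""
    return x ** 5 - 3.0 * x ** 2 + 2.0


def secant(x_prev, x_curr, tol=1e-8, max_steps=10000):
    """Unsafeguarded two-point iteration: always keep the two newest points."""
    steps = 0
    while abs(g(x_curr)) >= tol and steps < max_steps:
        slope = (g(x_curr) - g(x_prev)) / (x_curr - x_prev)
        x_prev, x_curr = x_curr, x_curr - g(x_curr) / slope
        steps += 1
    return x_curr, steps


def false_position(lo, hi, tol=1e-8, max_steps=10000):
    """Bracketing variant: keep whichever pair still encloses the root."""
    steps = 0
    x = lo
    while abs(g(x)) >= tol and steps < max_steps:
        x = hi - g(hi) * (hi - lo) / (g(hi) - g(lo))
        if g(lo) * g(x) < 0.0:
            hi = x
        else:
            lo = x
        steps += 1
    return x, steps


print("secant:        ", secant(2.0, 2.1))
print("false position:", false_position(1.05, 2.0))
```

Because f' is convex on the interval used, false position never replaces the endpoint x = 2, so one index in its error recurrence is frozen and convergence is only linear; the secant iteration, which discards the oldest point, needs far fewer steps.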

Thus  the  statement  in  Tamir  [17],  that  bracketing  algorithms  do  not  lend 
themselves  to  the  difference  equation  approach,  and  the  conjecture  made  there 
that  the  interpolation  and  the  bracketing  algorithms  have  the  same  rates  of 
convergence,  are  both  false. 

A bracketing procedure that aims at maintaining the rate of convergence of
the underlying interpolation should coincide with it near the solution.